12 research outputs found

    An Enhanced Multiway Sorting Network Based on n-Sorters

    Merging-based sorting networks are an important family of sorting networks. Most merge sorting networks are based on 2-way or multi-way merging algorithms using 2-sorters as basic building blocks. An alternative is to use n-sorters, instead of 2-sorters, as the basic building blocks so as to greatly reduce the number of sorters as well as the latency. Based on a modified Leighton's columnsort algorithm, an n-way merging algorithm, referred to as SS-Mk, that uses n-sorters as basic building blocks was proposed. In this work, we first propose a new multiway merging algorithm with n-sorters as basic building blocks that merges n sorted lists of m values each in 1 + ceil(m/2) stages (n <= m). Based on our merging algorithm, we also propose a sorting algorithm, which requires O(N log2 N) basic sorters to sort N inputs. While the asymptotic complexity (in terms of the required number of sorters) of our sorting algorithm is the same as the SS-Mk, for wide ranges of N, our algorithm requires fewer sorters than the SS-Mk. Finally, we consider a binary sorting network, where the basic sorter is implemented in threshold logic and scales linearly with the number of inputs, and compare the complexity in terms of the required number of gates. For wide ranges of N, our algorithm requires fewer gates than the SS-Mk. Comment: 13 pages, 14 figures

    Composite Cyclotomic Fourier Transforms with Reduced Complexities

    Discrete Fourier transforms (DFTs) over finite fields have widespread applications in digital communication and storage systems. Hence, reducing the computational complexities of DFTs is of great significance. Recently proposed cyclotomic fast Fourier transforms (CFFTs) are promising due to their low multiplicative complexities. Unfortunately, there are two issues with CFFTs: (1) they rely on efficient short cyclic convolution algorithms, which have not been investigated thoroughly yet, and (2) they have very high additive complexities when directly implemented. In this paper, we address both issues. One of the main contributions of this paper is efficient bilinear 11-point cyclic convolution algorithms, which allow us to construct CFFTs over GF(2^{11}). The other main contribution of this paper is that we propose composite cyclotomic Fourier transforms (CCFTs). In comparison to previously proposed fast Fourier transforms, our CCFTs achieve lower overall complexities for moderate to long lengths, and the improvement significantly increases as the length grows. Our 2047-point and 4095-point CCFTs are also the first efficient DFTs of such lengths to the best of our knowledge. Finally, our CCFTs are also advantageous for hardware implementations due to their regular and modular structure. Comment: submitted to IEEE Trans. on Signal Processing

    Extended Butterfly Networks

    This paper defines a new network called the Extended Butterfly. The extended butterfly of degree n (XBn) has n^2 2^n nodes, diameter equal to ⌊3n/2⌋ and a constant node degree of 8. XBn is symmetric and contains n distinct copies of Bn. We also show that XBn supports all cycle subgraphs except those of odd lengths when n is even and of odd lengths less than n.

    Mapping cycles and trees on wrap-around butterfly graphs

    Abstract. We give a new algebraic representation for the wrap-around butterfly interconnection network. This new representation is based on the direct product of groups and finite fields and allows an algebraic expression of the network connectivity. The abstract algebraic tools may then be employed to explore the structural properties of the butterfly. In this paper we exploit this model to map guest graphs on the butterfly. In particular, we provide designs of unit dilation mappings of cycles of all possible lengths on butterflies. We also map the largest possible binary trees on butterfly networks with a dilation of 2 if the network degree is less than 16, 3 if it is less than 32, and 4 if it is less than 64. This is a great improvement over previous results.